An Active Exploration Method for Data Efficient Reinforcement Learning
Authors
Abstract
Similar resources
An Accumulative Exploration Method for Reinforcement Learning
Agents in Multi Agent Systems can coordinate their actions by communicating. We investigate a minimal form of communication, where the signals that agents send represent evaluations of the behavior of the receiving agent. Learning to act according to these signals is a typical Reinforcement Learning problem. The backpropagation neural network has been used to predict rewards that will follow an...
Efficient Exploration for Reinforcement Learning
Reinforcement learning is often regarded as one of the hardest problems in machine learning. Algorithms for solving these problems often require copious resources in comparison to other problems, and will often fail for no obvious reason. This report surveys a set of algorithms for various reinforcement learning problems that are known to terminate with a good solution after a number of interacti...
Active Policy Iteration: Efficient Exploration through Active Learning for Value Function Approximation in Reinforcement Learning
Appropriately designing sampling policies is highly important for obtaining better control policies in reinforcement learning. In this paper, we first show that the least-squares policy iteration (LSPI) framework allows us to employ statistical active learning methods for linear regression. Then we propose a design method of good sampling policies for efficient exploration, which is particularl...
Efficient Exploration in Reinforcement Learning
An agent acting in a world makes observations, takes actions, and receives rewards for the actions taken. Given a history of such interactions, the agent must make the next choice of action so as to maximize the long term sum of rewards. To do this well, an agent may take suboptimal actions which allow it to gather the information necessary to later take optimal or near-optimal actions with res...
Efficient exploration through active learning for value function approximation in reinforcement learning
Appropriately designing sampling policies is highly important for obtaining better control policies in reinforcement learning. In this paper, we first show that the least-squares policy iteration (LSPI) framework allows us to employ statistical active learning methods for linear regression. Then we propose a design method of good sampling policies for efficient exploration, which is particularl...
Journal
Journal title: International Journal of Applied Mathematics and Computer Science
Year: 2019
ISSN: 2083-8492
DOI: 10.2478/amcs-2019-0026